Code Positioning for VLIW Architectures
Abstract
Several studies have considered reducing instruction cache misses and branch-penalty stall cycles through various forms of code placement. Most proposed approaches rearrange procedures or basic blocks to speed up execution on sequential architectures with branch prediction. Moreover, most work focuses mainly on instruction cache performance and disregards execution cycles. To the best of our knowledge, no work has specifically addressed statically scheduled ILP machines such as VLIWs with control-transfer delay slots. We propose a new code positioning algorithm designed specifically for VLIW-style architectures, which makes it possible to trade off a tighter schedule against program locality. Our measurements indicate that code positioning, through a tighter program schedule and the removal of unconditional jumps, can reduce the number of execution cycles by up to 21% while also improving program locality and instruction cache performance.
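The abstract does not spell out the proposed algorithm, so the following is only a minimal sketch of the general idea behind profile-guided basic-block positioning: a classic greedy chaining heuristic in the style of Pettis and Hansen, not the VLIW-specific method of the paper. The function names, the block labels, and the edge counts are all hypothetical. It shows how placing a hot successor directly after its predecessor turns a taken jump into a fall-through.

```python
def chain_blocks(blocks, edges, entry):
    """Greedy bottom-up chaining (illustrative, not the paper's algorithm).

    blocks: list of basic-block labels
    edges:  dict mapping (src, dst) -> profiled execution count
    entry:  label of the entry block, which must stay first in the layout
    """
    chain_of = {b: [b] for b in blocks}  # block -> the chain (list) holding it
    # Visit control-flow edges from hottest to coldest.
    for (src, dst), _count in sorted(edges.items(), key=lambda kv: -kv[1]):
        a, b = chain_of[src], chain_of[dst]
        # Merge only if src ends one chain, dst starts a different chain,
        # and the merge would not place anything ahead of the entry block.
        if a is not b and a[-1] == src and b[0] == dst and dst != entry:
            a.extend(b)
            for blk in b:
                chain_of[blk] = a
    # Emit the entry chain first, then the remaining chains in source order.
    layout, seen = [], set()
    for blk in [entry] + list(blocks):
        chain = chain_of[blk]
        if id(chain) not in seen:
            seen.add(id(chain))
            layout.extend(chain)
    return layout


def fall_through_weight(layout, edges):
    """Sum the counts of edges whose target directly follows their source."""
    return sum(edges.get(pair, 0) for pair in zip(layout, layout[1:]))


# Hypothetical diamond-shaped control-flow graph with a hot path A -> C -> D.
blocks = ["A", "B", "C", "D"]
edges = {("A", "B"): 10, ("A", "C"): 90, ("B", "D"): 10, ("C", "D"): 90}
layout = chain_blocks(blocks, edges, entry="A")
print(layout, fall_through_weight(layout, edges))
```

In this toy graph the hot path ends up contiguous, so more of the profiled control transfers become fall-throughs than in the original source order. A VLIW-oriented variant would additionally have to account for delay slots and schedule length, which this sketch ignores.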
Similar resources
Some Design Aspects for VLIW Architectures Exploiting Fine-Grained Parallelism
Very Long Instruction Word architectures (VLIW architectures) can exploit the fine-grained (instruction-level) parallelism typically found in sequential-natured program code. A parallelizing compiler is used to restructure the program code. Sophisticated global compaction techniques have emerged that can effectively extract fine-grained parallelism from ordinary sequential-natured program code. In t...
Dynamically Trace Scheduled VLIW Architectures
This paper presents a new architecture organisation, the dynamically trace scheduled VLIW (DTSVLIW), that can be used to implement machines that execute the code of current RISC or CISC instruction set architectures in a VLIW fashion, with backward code compatibility.
Execution-Based Scheduling for VLIW Architectures
We describe a new dynamic software scheduling technique for VLIW architectures, which compiles into VLIW code the program paths that are actually executed. Unlike trace processors, or DIF, the technique executes operations speculatively on multiple paths through the code, is resilient to branch mispredictions, and can achieve very large dynamic window sizes necessary for high ILP. Aggressive op...
Simultaneous MultiStreaming for Complexity-Effective VLIW Architectures
Very Long Instruction Word (VLIW) architectures exploit instruction-level parallelism (ILP) with the help of the compiler to achieve higher instruction throughput with minimal hardware. However, control and data dependencies between operations limit the available ILP, which not only hinders the scalability of VLIW architectures but also results in code size expansion. Although speculation and p...
Effect of Multicycle Instructions on the Integer Performance of the Dynamically Trace Scheduled VLIW Architecture
Dynamically trace scheduled VLIW (DTSVLIW) architectures can be used to implement machines that execute code of current RISC or CISC instruction set architectures in a VLIW fashion, delivering instruction level parallelism (ILP) with backward code compatibility. This paper presents the effect of multicycle instructions on the performance of a DTSVLIW architecture running the SPECint95 benchmarks.